Exact Performance of CoD Estimators in Discrete Prediction

Authors

  • Ting Chen
  • Ulisses Braga-Neto
Abstract

The coefficient of determination (CoD) has significant applications in genomics, for example, in the inference of gene regulatory networks. We study several CoD estimators, based on the resubstitution, leave-one-out, cross-validation, and bootstrap error estimators. We present an exact formulation of performance metrics for the resubstitution and leave-one-out CoD estimators, assuming the discrete histogram rule. Numerical experiments are carried out using a parametric Zipf model, where we compute exact performance metrics of the resubstitution and leave-one-out CoD estimators using the derived equations, for varying actual CoD, sample size, and bin size. These results are compared to approximate performance metrics of the 10-repeated 2-fold cross-validation and 0.632 bootstrap CoD estimators, computed via Monte Carlo sampling. The numerical results lead to a perhaps surprising conclusion: under the Zipf model considered, and for moderate and large values of the actual CoD, the resubstitution CoD estimator is the least biased and least variable among all CoD estimators, especially for a small number of predictors. We also observed that the leave-one-out and cross-validation CoD estimators tend to perform the worst, whereas the performance of the bootstrap CoD estimator is intermediate, despite its high computational complexity.
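To make the quantities in the abstract concrete, the following is a minimal Python sketch of resubstitution and leave-one-out CoD estimates under the discrete histogram rule. It is an illustration under stated assumptions, not the authors' exact formulation. It takes the CoD as (eps0 - eps) / eps0, where eps0 is the error of the best predictor that ignores the predictors and eps is the error of the predictor that uses them. The function names, the tie-breaking toward class 0, the convention of returning 0 when the estimated eps0 is zero, the use of the empirical eps0 for both estimators, and the particular Zipf parametrization (bin weights proportional to i^(-alpha), reversed for class 1) are all assumptions made for this example.

```python
import random


def histogram_counts(xs, ys, b):
    """Per-bin class counts: counts[i] = [number of class 0, number of class 1]."""
    counts = [[0, 0] for _ in range(b)]
    for x, y in zip(xs, ys):
        counts[x][y] += 1
    return counts


def resub_error(counts, n):
    # The histogram rule predicts the majority class in each bin, so the
    # resubstitution errors are the minority counts.
    return sum(min(u, v) for u, v in counts) / n


def loo_error(counts, n):
    # Leave-one-out: remove one point, re-apply the majority vote in its bin,
    # and check the prediction. Ties go to class 0 (assumption).
    errors = 0
    for u, v in counts:
        if v > u - 1:       # a held-out class-0 point would be predicted as 1
            errors += u
        if u >= v - 1:      # a held-out class-1 point would be predicted as 0
            errors += v
    return errors / n


def cod_estimate(err, ys):
    # Empirical eps0: error of always predicting the more frequent class.
    n = len(ys)
    n1 = sum(ys)
    eps0 = min(n1, n - n1) / n
    if eps0 == 0:
        return 0.0          # convention assumed for this sketch
    return (eps0 - err) / eps0


def zipf_probs(b, alpha):
    # Assumed Zipf parametrization: weight of bin i (1-based) is i**(-alpha).
    w = [i ** (-alpha) for i in range(1, b + 1)]
    s = sum(w)
    return [x / s for x in w]


def true_cod(b, alpha, c):
    # c = P(Y = 1); class 1 uses the reversed bin probabilities (assumption).
    p = zipf_probs(b, alpha)
    q = p[::-1]
    eps0 = min(c, 1 - c)
    eps = sum(min((1 - c) * pi, c * qi) for pi, qi in zip(p, q))
    return (eps0 - eps) / eps0


def sample(n, b, alpha, c, rng):
    p = zipf_probs(b, alpha)
    q = p[::-1]
    xs, ys = [], []
    for _ in range(n):
        y = 1 if rng.random() < c else 0
        x = rng.choices(range(b), weights=q if y else p)[0]
        xs.append(x)
        ys.append(y)
    return xs, ys


if __name__ == "__main__":
    rng = random.Random(0)
    b, alpha, c, n = 4, 1.0, 0.5, 40
    xs, ys = sample(n, b, alpha, c, rng)
    counts = histogram_counts(xs, ys, b)
    print("actual CoD:       ", true_cod(b, alpha, c))
    print("resubstitution CoD:", cod_estimate(resub_error(counts, n), ys))
    print("leave-one-out CoD: ", cod_estimate(loo_error(counts, n), ys))
```

Repeating the sampling step many times and averaging the difference between each estimate and the actual CoD gives a Monte Carlo approximation of the bias and variance discussed in the abstract. The paper instead reports exact values for the resubstitution and leave-one-out estimators.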

Similar resources

Bayesian estimation of the discrete coefficient of determination

The discrete coefficient of determination (CoD) measures the nonlinear interaction between discrete predictor and target variables and has had far-reaching applications in Genomic Signal Processing. Previous work has addressed the inference of the discrete CoD using classical parametric and nonparametric approaches. In this paper, we introduce a Bayesian framework for the inference of the discr...

M-estimators as GMM for Stable Laws Discretizations

This paper is devoted to "Some Discrete Distributions Generated by Standard Stable Densities" (in short, Discrete Stable Densities). The large-sample properties of M-estimators as obtained by the "Generalized Method of Moments" (GMM) are discussed for such distributions. Some corollaries are proposed. Moreover, using the respective results we demonstrate the large-sample pro...

On Mathematical Characteristics of some Improved Estimators of the Mean and Variance Components in Elliptically Contoured Models

In this paper we treat a general form of location model. It is typically assumed that the error term is distributed according to the law belonging to the class of elliptically contoured distribution. Some sorts of shrinkage estimators of location and scale parameters are proposed and their exact bias and MSE expressions are derived. The performance of the estimators under study are compl...

Prediction of the waste stabilization pond performance using linear multiple regression and multi-layer perceptron neural network: a case study of Birjand, Iran

Background: Data mining (DM) is an approach used in extracting valuable information from environmental processes. This research depicts a DM approach used in extracting some information from influent and effluent wastewater characteristic data of a waste stabilization pond (WSP) in Birjand, a city in Eastern Iran. Methods: Multiple regression (MR) and neural network (NN) models were examined u...

COD Removal Prediction of DAF Unit Refinery Wastewater by Using Neuro-Fuzzy Systems (ANFIS) (Short Communication)

In this study, the Dissolved Air Flotation (DAF) system in an oil refinery was investigated for the treatment of refinery wastewater. To investigate the system, a laboratory-scale rig was built. The aim is to remove some of the wastewater pollutants and to model data from the COD test. The effect of several parameters on flotation efficiency, namely saturator pressure and coagulant dose, on CO...

Journal title:
  • EURASIP J. Adv. Sig. Proc.

Volume: 2010

Publication date: 2010